Trypsin is a protease which cleaves on the carboxyl side of lysine and arginine residues. Trypsin is
produced in the form of a precursor or zymogen molecule called trypsinogen. Trypsinogen is converted to
trypsin by the action of enteropeptidase.
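
The cleavage rule stated above can be sketched in a few lines. The sketch below is an illustration only, not part of the disclosed method: the function name `tryptic_peptides` and the example peptide are made up, and the rule is deliberately simplified (it cuts after every lysine or arginine and ignores any sequence context that can slow cleavage in practice).

```python
def tryptic_peptides(protein: str) -> list[str]:
    """Split a one-letter protein sequence after every Lys (K) or Arg (R).

    Simplified rule: every K or R is treated as a cleavage site.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            # Cleavage is on the carboxyl side, so K/R ends the peptide.
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Trailing residues after the last K/R form the final peptide.
        peptides.append(protein[start:])
    return peptides


# Hypothetical peptide, used only to illustrate the rule.
example = "AGKTLYRSEWKF"
print(tryptic_peptides(example))
```

Each returned fragment other than the last ends in K or R, which is the signature of a trypsin-only digest; fragments ending in other residues would point to a contaminating protease of the kind discussed below.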

The substrate specificity of trypsin makes it a useful enzyme for conversion of biosynthetically produced
molecules to preferred molecules. An example is the conversion of proinsulin to insulin via trypsin-mediated
removal of the connecting peptide. Trypsin is commercially available and is produced primarily by isolation from
the pancreas of a variety of species. Bovine and porcine pancreases are particularly common sources
of trypsin. Purification procedures utilized to purify trypsin for later use in bioconversion processes aim to
remove undesirable copurifying proteases from the desired trypsin product.

Notwithstanding much effort at purification, various lots of trypsin contain variable amounts of contaminating
proteases. Chymotrypsin is frequently present in minimal amounts in trypsin production lots. The presence
of even a minor amount of a contaminating protease results in undesirable cleavage of various products
when only the trypsin-mediated cleavage is desired. Conversion of proinsulin to insulin via the action of trypsin
is thus complicated by contamination with other proteases. The present invention solves the problem of
contaminating proteases by providing recombinant DNA expression systems for the biosynthetic production
of bovine trypsin and trypsinogen. Thus, the present invention represents a significant advance in the
art of trypsin and trypsinogen production, thereby greatly facilitating bioconversion of precursor molecules.

The present invention discloses and claims DNA sequences which encode bovine trypsin and trypsinogen.
Expression vectors useful for producing trypsin and trypsinogen are also disclosed and claimed, as are host
cells transformed with these expression vectors. The expression vectors and host cells of the present invention
provide a convenient source of trypsin and trypsinogen molecules, free of contaminating proteases which
disrupt biosynthetic conversion processes.

A series of figures is provided to further the understanding of the invention. Figure 1 is a restriction site and
function map of plasmid pRMG4. Figure 2 is a restriction site and function map of plasmid pRMG5. Figure 3 is
a restriction site and function map of plasmid pRMG6. Figure 4 is a restriction site and function map of plasmid
pRMG7. Figure 5 is a restriction site and function map of plasmid pHKY390.

The ability to produce trypsin either by direct expression or by production of the zymogen, trypsinogen,
affords flexibility in the isolation, purification and folding of trypsin by allowing the initial steps of trypsin
production to be performed on an enzymatically inactive form.

The expression vectors provided by the instant invention were prepared by replacing the kanamycin
phosphotransferase coding region of plasmid pHKY390 with chemically synthesized double-stranded DNA
encoding trypsin or trypsinogen. Plasmid pHKY390 was deposited with the Northern Regional Research Laboratory
(N.R.R.L.), Peoria, IL USA on January 17, 1992, where it is available under the accession number NRRL
B-18885. Plasmid pHKY390 was deposited in the E. coli host strain RV308.

The chemically synthesized genes encoding trypsin and trypsinogen were prepared on an Applied Biosystems
DNA synthesizer using beta-cyanoethyl phosphoramidite chemistry. A series of 20 oligonucleotides was
synthesized as described in Example 1. The appropriate oligonucleotides were then annealed and ligated to
generate double stranded DNA molecules encoding bovine trypsin and bovine trypsinogen. The double stranded
DNA sequence which was prepared to encode bovine trypsin is provided below as Formula I. The amino
acid sequence encoded by the corresponding DNA is provided below the oligonucleotide sequence. Sequence
I.D. 21, which is provided in a later section of this disclosure, corresponds to the sense strand of the sequence
provided in Formula I. Sequence I.D. 22 corresponds to the amino acid sequence of Formula I. The
oligonucleotide sequences which flank the coding sequence are designated by lower case letters, and the stop
codon, TAG, is designated as END in the amino acid sequence provided below the oligonucleotide sequence of
Formula I.
Formula I

     NdeI                       NarI
  1  catATGATCGTTGGCGGCTACACCTGTGGCGCCAATACCGTCCCGTACCAGGTGTCCCTG  60
     gtaTACTAGCAACCGCCGATGTGGACACCGCGGTTATGGCAGGGCATGGTCCACAGGGAC
        MetIleValGlyGlyTyrThrCysGlyAlaAsnThrValProTyrGlnValSerLeu

 61  AATTCTGGCTACCACTTCTGTGGTGGCTCCCTCATCAACTCCCAGTGGGTGGTATCAGCG  120
     TTAAGACCGATGGTGAAGACACCACCGAGGGAGTAGTTGAGGGTCACCCACCATAGTCGC
     AsnSerGlyTyrHisPheCysGlyGlySerLeuIleAsnSerGlnTrpValValSerAla

121  GCCCACTGCTACAAGTCCGGCATCCAGGTGCGTCTGGGCGAGGATAACATCAACGTCGTG  180
     CGGGTGACGATGTTCAGGCCGTAGGTCCACGCAGACCCGCTCCTATTGTAGTTGCAGCAC
     AlaHisCysTyrLysSerGlyIleGlnValArgLeuGlyGluAspAsnIleAsnValVal

                                            ApaLI
181  GAGGGCAATGAGCAGTTCATCTCCGCATCCAAGTCCATCGTGCACCCGTCCTACAACTCC  240
     CTCCCGTTACTCGTCAAGTAGAGGCGTAGGTTCAGGTAGCACGTGGGCAGGATGTTGAGG
     GluGlyAsnGluGlnPheIleSerAlaSerLysSerIleValHisProSerTyrAsnSer

241  AACACTCTGAACAATGACATCATGCTGATCAAGCTCAAGTCCGCCGCATCCCTGAACTCC  300
     TTGTGAGACTTGTTACTGTAGTACGACTAGTTCGAGTTCAGGCGGCGTAGGGACTTGAGG
     AsnThrLeuAsnAsnAspIleMetLeuIleLysLeuLysSerAlaAlaSerLeuAsnSer

301  CGCGTGGCCTCCATCTCTCTGCCGACCTCCTGTGCCTCCGCCGGCACGCAGTGCCTCATC  360
     GCGCACCGGAGGTAGAGAGACGGCTGGAGGACACGGAGGCGGCCGTGCGTCACGGAGTAG
     ArgValAlaSerIleSerLeuProThrSerCysAlaSerAlaGlyThrGlnCysLeuIle

361  TCTGGCTGGGGCAACACTAAGAGCTCTGGCACCTCCTACCCAGACGTGCTGAAGTGCCTG  420
     AGACCGACCCCGTTGTGATTCTCGAGACCGTGGAGGATGGGTCTGCACGACTTCACGGAC
     SerGlyTrpGlyAsnThrLysSerSerGlyThrSerTyrProAspValLeuLysCysLeu

                                                 BalI
421  AAGGCTCCTATCCTGAGCGATTCCTCCTGTAAGTCCGCCTACCCTGGCCAGATTACCAGC  480
     TTCCGAGGATAGGACTCGCTAAGGAGGACATTCAGGCGGATGGGACCGGTCTAATGGTCG
     LysAlaProIleLeuSerAspSerSerCysLysSerAlaTyrProGlyGlnIleThrSer

481  AACATGTTCTGTGCCGGCTACCTGGAGGGCGGCAAGGATTCCTGTCAGGGTGATTCTGGT  540
     TTGTACAAGACACGGCCGATGGACCTCCCGCCGTTCCTAAGGACAGTCCCACTAAGACCA
     AsnMetPheCysAlaGlyTyrLeuGluGlyGlyLysAspSerCysGlnGlyAspSerGly

541  GGCCCTGTGGTCTGCTCCGGCAAGCTCCAAGGCATCGTCTCCTGGGGTTCCGGCTGTGCC  600
     CCGGGACACCAGACGAGGCCGTTCGAGGTTCCGTAGCAGAGGACCCCAAGGCCGACACGG
     GlyProValValCysSerGlyLysLeuGlnGlyIleValSerTrpGlySerGlyCysAla

601  CAGAAGAACAAGCCTGGCGTCTACACCAAGGTCTGTAACTATGTGTCCTGGATTAAGCAG  660
     GTCTTCTTGTTCGGACCGCAGATGTGGTTCCAGACATTGATACACAGGACCTAATTCGTC
     GlnLysAsnLysProGlyValTyrThrLysValCysAsnTyrValSerTrpIleLysGln

                      BamHI
661  ACCATAGCTTCCAATtaggatcc  683
     TGGTATCGAAGGTTAatcctagg
     ThrIleAlaSerAsnEnd


The double stranded sequence encoding bovine trypsin is provided to add detail to the single stranded
format required in the Sequence Identification section of this disclosure. Restriction endonuclease recognition
sites are provided above the sequence as appropriate; the amino acid encoded by each codon is presented
below the DNA sequence; and the nucleotides forming the flanking regions of the coding region are provided
to illustrate, via restriction endonuclease recognition sites and linkers, the manner whereby the coding sequence
was inserted into the expression vectors.
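
The relationship between the three rows of Formula I (sense strand, complementary strand, encoded amino acids) can be reproduced with a short sketch. This is an illustration under stated assumptions rather than part of the disclosure: the helper names are hypothetical, the standard genetic code is assumed, and the input is the first thirty coding nucleotides of Formula I, beginning at the ATG.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(dna: str) -> str:
    """Return the base-paired strand, written in the same left-to-right order."""
    return dna.upper().translate(COMPLEMENT)


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First ten codons of the trypsin coding region in Formula I.
sense = "ATGATCGTTGGCGGCTACACCTGTGGCGCC"
print(complement(sense))  # second row of the formula
print(translate(sense))   # third row, in one-letter code
```

The one-letter output corresponds to the three-letter row of the formula (M = Met, I = Ile, V = Val, and so on); a stop codon is reported as `*` rather than END.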

The DNA sequence synthesized to comprise a bovine trypsinogen encoding region is provided below as
Formula II. The format is similar to that provided above for the region encoding bovine trypsin (Formula I).
Sequence I.D. 23, which is provided in a later section of this disclosure, corresponds to the coding sequence of
Formula II while Sequence I.D. 24 provides the amino acid sequence encoded thereby. The oligonucleotide
sequences which flank the coding sequence are designated by lower case letters, and the stop codon, TAG,
is designated as END in the amino acid sequence provided below the oligonucleotide sequence of Formula
II.
Formula II

     NdeI                                         NarI
  1  catATGGTGGATGATGATGATAAGATCGTTGGCGGCTACACCTGTGGCGCCAATACCGTC  60
     gtaTACCACCTACTACTACTATTCTAGCAACCGCCGATGTGGACACCGCGGTTATGGCAG
        MetValAspAspAspAspLysIleValGlyGlyTyrThrCysGlyAlaAsnThrVal

 61  CCGTACCAGGTGTCCCTGAATTCTGGCTACCACTTCTGTGGTGGCTCCCTCATCAACTCC  120
     GGCATGGTCCACAGGGACTTAAGACCGATGGTGAAGACACCACCGAGGGAGTAGTTGAGG
     ProTyrGlnValSerLeuAsnSerGlyTyrHisPheCysGlyGlySerLeuIleAsnSer

121  CAGTGGGTGGTATCAGCGGCCCACTGCTACAAGTCCGGCATCCAGGTGCGTCTGGGCGAG  180
     GTCACCCACCATAGTCGCCGGGTGACGATGTTCAGGCCGTAGGTCCACGCAGACCCGCTC
     GlnTrpValValSerAlaAlaHisCysTyrLysSerGlyIleGlnValArgLeuGlyGlu

                                                              ApaLI
181  GATAACATCAACGTCGTGGAGGGCAATGAGCAGTTCATCTCCGCATCCAAGTCCATCGTG  240
     CTATTGTAGTTGCAGCACCTCCCGTTACTCGTCAAGTAGAGGCGTAGGTTCAGGTAGCAC
     AspAsnIleAsnValValGluGlyAsnGluGlnPheIleSerAlaSerLysSerIleVal

241  CACCCGTCCTACAACTCCAACACTCTGAACAATGACATCATGCTGATCAAGCTCAAGTCC  300
     GTGGGCAGGATGTTGAGGTTGTGAGACTTGTTACTGTAGTACGACTAGTTCGAGTTCAGG
     HisProSerTyrAsnSerAsnThrLeuAsnAsnAspIleMetLeuIleLysLeuLysSer

301  GCCGCATCCCTGAACTCCCGCGTGGCCTCCATCTCTCTGCCGACCTCCTGTGCCTCCGCC  360
     CGGCGTAGGGACTTGAGGGCGCACCGGAGGTAGAGAGACGGCTGGAGGACACGGAGGCGG
     AlaAlaSerLeuAsnSerArgValAlaSerIleSerLeuProThrSerCysAlaSerAla

361  GGCACGCAGTGCCTCATCTCTGGCTGGGGCAACACTAAGAGCTCTGGCACCTCCTACCCA  420
     CCGTGCGTCACGGAGTAGAGACCGACCCCGTTGTGATTCTCGAGACCGTGGAGGATGGGT
     GlyThrGlnCysLeuIleSerGlyTrpGlyAsnThrLysSerSerGlyThrSerTyrPro

421  GACGTGCTGAAGTGCCTGAAGGCTCCTATCCTGAGCGATTCCTCCTGTAAGTCCGCCTAC  480
     CTGCACGACTTCACGGACTTCCGAGGATAGGACTCGCTAAGGAGGACATTCAGGCGGATG
     AspValLeuLysCysLeuLysAlaProIleLeuSerAspSerSerCysLysSerAlaTyr

       BalI
481  CCTGGCCAGATTACCAGCAACATGTTCTGTGCCGGCTACCTGGAGGGCGGCAAGGATTCC  540
     GGACCGGTCTAATGGTCGTTGTACAAGACACGGCCGATGGACCTCCCGCCGTTCCTAAGG
     ProGlyGlnIleThrSerAsnMetPheCysAlaGlyTyrLeuGluGlyGlyLysAspSer

541  TGTCAGGGTGATTCTGGTGGCCCTGTGGTCTGCTCCGGCAAGCTCCAAGGCATCGTCTCC  600
     ACAGTCCCACTAAGACCACCGGGACACCAGACGAGGCCGTTCGAGGTTCCGTAGCAGAGG
     CysGlnGlyAspSerGlyGlyProValValCysSerGlyLysLeuGlnGlyIleValSer

601  TGGGGTTCCGGCTGTGCCCAGAAGAACAAGCCTGGCGTCTACACCAAGGTCTGTAACTAT  660
     ACCCCAAGGCCGACACGGGTCTTCTTGTTCGGACCGCAGATGTGGTTCCAGACATTGATA
     TrpGlySerGlyCysAlaGlnLysAsnLysProGlyValTyrThrLysValCysAsnTyr

                                        BamHI
661  GTGTCCTGGATTAAGCAGACCATAGCTTCCAATtaggatcc  701
     CACAGGACCTAATTCGTCTGGTATCGAAGGTTAatcctagg
     ValSerTrpIleLysGlnThrIleAlaSerAsnEnd

The gene for bovine trypsin was prepared by assembling subsets of the oligonucleotides described in
Example 1 into three separate cassettes prior to combining the three cassettes to form the full length bovine
trypsin encoding gene. Oligonucleotides BT1-6 were annealed and inserted into the commercially available vector,
pBluescript SK+ (Stratagene). Oligonucleotide sequences BT7-12 were likewise annealed and inserted into a
pBluescript SK+ cloning vector. The third cassette was generated upon ligation of oligonucleotides BT13-18
and insertion into a third pBluescript SK+ cloning vector. The three cassettes encoding portions of the bovine
trypsin gene each have a HindIII terminus and an XbaI terminus. The bovine trypsin encoding sequence was
synthesized as three separate components to minimize the chance for spontaneous mutations occurring within
the sequence. The cloning vector comprising oligonucleotides BT1-6 is designated plasmid pRMG1. The cloning
vector comprising oligonucleotides BT7-12 is designated plasmid pRMG2. The cloning vector comprising
oligonucleotides BT13-18 is designated plasmid pRMG3. The three portions of the bovine trypsin encoding
sequence were prepared by digesting plasmids pRMG1, pRMG2, and pRMG3 with appropriate endonucleases
followed by ligation of the fragments and insertion into an expression vector. The expression vector utilized in
the construction of trypsin and trypsinogen expression vectors is designated plasmid pHKY390. Plasmid pHKY390
has been deposited in the Northern Regional Research Laboratory, Peoria, IL where it is publicly available under
the accession number NRRL B-18885.

A restriction site and function map of plasmid pHKY390 is provided in Figure 5. Plasmid pHKY390 was
originally used as a promoter probe wherein promoters were evaluated for their ability to cause transcription of
the kanamycin phosphotransferase gene of plasmid pHKY390. Reference to Figure 5 reveals that an NdeI site and
a BamHI site are conveniently located in plasmid pHKY390 for insertion of a sequence encoding a polypeptide
product of interest. Plasmid pRMG4 was constructed by insertion of the trypsin encoding gene into the
NdeI/BamHI digested plasmid pHKY390. The three fragments which upon ligation generate the trypsin encoding
gene were prepared as described in Example 5. A restriction site and function map of plasmid pRMG4 is
provided in Figure 1. Plasmid pRMG4 utilizes a modified lambda pL promoter, p97, to drive transcription of a
two cistron message wherein the second cistron encodes bovine trypsin. Plasmid pRMG4 uses a tetracycline
resistance gene as a selectable marker. The temperature sensitive lambda pL repressor, cI857, is utilized to
provide regulatable transcription from the modified lambda promoter. The origin of replication utilized in plasmid
pRMG4 was prepared originally from plasmid pBR322. Plasmid pRMG4 also utilizes a rop gene. The rop gene
provides a vector copy number of approximately fifteen to twenty when utilized, as in the vectors of the present
invention, with a pBR322-derived origin of replication.

Plasmid pRMG7 is the preferred expression vector for bovine trypsinogen. Reference to Figures 1 and 4 
and the examples indicates the high level of similarity between the preferred expression vectors for bovine 
trypsin and bovine trypsinogen. Accordingly the description of the elements in plasmid pRMG4 is likewise ap- 
plicable to plasmid pRMG7. 

A variety of E. coli host cells were utilized in the construction of the vectors and expression systems of
the present invention. E. coli RV308 is available from the Northern Regional Research Laboratory, Peoria, IL
(NRRL) under the accession number NRRL B-15624. E. coli MM294 is available from the American Type Culture
Collection, Rockville, MD (ATCC) under the accession number ATCC 31446. The inability of either of
these strains to support expression of bovine trypsin or bovine trypsinogen from plasmids pRMG4 and pRMG7
respectively underscores the unpredictability which remains in the art of molecular biology. The reason or
reasons why such well recognized E. coli host strains were incapable of achieving expression of trypsin and
trypsinogen remain unelucidated. Digestion of either the messenger RNA or the desired protein product could
account for the failure to effect expression in these strains. E. coli L687, a lon- host cell, was eventually tried
and this host cell strain proved to be competent for expression of bovine trypsin and bovine trypsinogen from
pRMG4 and pRMG7 respectively. E. coli L687 was deposited in the NRRL where it is available under the
accession number B-18884. Accordingly, E. coli L687 transformed with plasmids pRMG4 and pRMG7 comprise
the respective best modes for producing bovine trypsin and bovine trypsinogen in prokaryotic cells. The media
utilized in the fermentative production of the enzyme and zymogen of the present invention affect the overall
production levels of the desired products. L-broth is the preferred medium for such fermentation processes. The
components of L-broth are 1% (w/v) Bacto tryptone; 0.5% (w/v) yeast extract; 0.5% (w/v) NaCl; and 0.1% (w/v)
dextrose at pH 7.0. L-agar is L-broth solidified with 1.5% (w/v) Bacto agar.
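
For bench use, the percentages above convert directly to weights, since 1% (w/v) is 1 g per 100 ml. The following sketch is only a convenience calculation; the function and dictionary names are hypothetical, and pH adjustment to 7.0 is not modelled.

```python
# Composition taken from the L-broth description above, in % (w/v).
L_BROTH_PERCENT_WV = {
    "Bacto tryptone": 1.0,
    "yeast extract": 0.5,
    "NaCl": 0.5,
    "dextrose": 0.1,
}
AGAR_PERCENT_WV = 1.5  # added only when preparing L-agar


def grams_needed(percent_wv: float, volume_ml: float) -> float:
    """Convert % (w/v) to grams: 1% (w/v) is 1 g per 100 ml."""
    return round(percent_wv * volume_ml / 100.0, 3)


def recipe(volume_ml: float, solid: bool = False) -> dict[str, float]:
    """Grams of each component for the requested volume of L-broth or L-agar."""
    amounts = {
        name: grams_needed(pct, volume_ml)
        for name, pct in L_BROTH_PERCENT_WV.items()
    }
    if solid:
        amounts["Bacto agar"] = grams_needed(AGAR_PERCENT_WV, volume_ml)
    return amounts


for name, grams in recipe(1000, solid=True).items():
    print(f"{name}: {grams} g per litre")
```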

The expression products of plasmids pRMG4 and pRMG7 have been established by conventional biochemical
methodologies to be bovine trypsin and trypsinogen respectively. The availability of trypsin, whether
expressed directly or converted from its zymogen precursor, provides a significant advantage in biochemical
conversion processes such as the removal of the connecting peptide of proinsulin. The source of enzyme devoid of
contaminating proteases allows substantially greater flexibility in the production of important medicinal
polypeptides such as insulin. The biosynthetic source of the enzyme also eliminates any concerns related to the
use of enzymes prepared from animal sources in the production of molecules which will be administered to
humans or animals.

The examples which follow are intended to further illustrate the present invention and are not to be
interpreted as limiting on the scope thereof. While the examples and detailed description sections of the present
invention are sufficient to guide anyone of ordinary skill in the art in the practice of the present invention, skilled
artisans are also directed to Molecular Cloning: A Laboratory Manual, Second Edition, Sambrook, J., Fritsch, E.F.,
and Maniatis, T., Cold Spring Harbor Press 1989 and Current Protocols in Molecular Biology, Ausubel, F.M.,
Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A., and Struhl, K., Ed., Greene Publishing
Associates and Wiley-Interscience 1989. The aforementioned resources provide an excellent technical supplement
to any discourse in genetic engineering.

The examples provide sources for reagents; however, it will be understood that numerous vendors market
reagents of high quality for use in the protocols and procedures described below, and the substitution of
reagents or protocols is contemplated by the present invention and embraced in the scope thereof. All
temperatures unless otherwise noted are expressed in degrees Centigrade. All percentages are on a weight per weight
basis unless otherwise noted.

Example 1 

Oligonucleotide synthesis and purification 

The following oligonucleotides were synthesized on an Applied Biosystems (Foster City, CA) model 380B
DNA synthesizer using beta-cyanoethyl phosphoramidite chemistry according to the manufacturer's
instructions. The single stranded DNA segments were conventionally purified on 12% polyacrylamide-7M urea gels
and resuspended in water.



BT1 (Sequence I.D. 1) (Sequence Length: 77)

5'-AGCTTCATATGATCGTTGGCGGCTACACCTGTGGCGCCAATACCGTCCCGTACCAGGTG
TCCCTGAATTCTGGCTAC-3'

BT2 (Sequence I.D. 2) (Sequence Length: 77)

5'-AGTGGTAGCCAGAATTCAGGGACACCTGGTACGGGACGGTATTGGCGCCACAGGTGTAG
CCGCCAACGATCATATGA-3'

BT3A (Sequence I.D. 3) (Sequence Length: 81)

5'-CACTTCTGTGGTGGCTCCCTCATCAACTCCCAGTGGGTGGTATCAGCGGCCCACTGCTA
CAAGTCCGGCATCCAGGTGCGT-3'

BT4A (Sequence I.D. 4) (Sequence Length: 81)

5'-CCAGACGCACCTGGATGCCGGACTTGTAGCAGTGGGCCGCTGATACCACCCACTGGGAG
TTGATGAGGGAGCCACCACAGA-3'

BT5 (Sequence I.D. 5) (Sequence Length: 73)

5'-CTGGGCGAGGATAACATCAACGTCGTGGAGGGCAATGAGCAGTTCATCTCCGCATCCAA
GTCCATCGTGCACT-3'

BT6 (Sequence I.D. 6) (Sequence Length: 73)

5'-CTAGAGTGCACGATGGACTTGGATGCGGAGATGAACTGCTCATTGCCCTCCACGACGTT
GATGTTATCCTCGC-3'

BT7 (Sequence I.D. 7) (Sequence Length: 84)

5'-AGCTTCATCGTGCACCCGTCCTACAACTCCAACACTCTGAACAATGACATCATGCTGAT
CAAGCTCAAGTCCGCCGCATCCCTG-3'

BT8 (Sequence I.D. 8) (Sequence Length: 84)

5'-AGTTCAGGGATGCGGCGGACTTGAGCTTGATCAGCATGATGTCATTGTTCAGAGTGTTG
GAGTTGTAGGACGGGTGCACGATGA-3'

BT9 (Sequence I.D. 9) (Sequence Length: 93)

5'-AACTCCCGCGTGGCCTCCATCTCTCTGCCGACCTCCTGTGCCTCCGCCGGCACGCAGTG
CCTCATCTCTGGCTGGGGCAACACTAAGAGCTCT-3'

BT10 (Sequence I.D. 10) (Sequence Length: 93)

5'-TGCCAGAGCTCTTAGTGTTGCCCCAGCCAGAGATGAGGCACTGCGTGCCGGCGGAGGCA
CAGGAGGTCGGCAGAGAGATGGAGGCCACGCGGG-3'

BT11 (Sequence I.D. 11) (Sequence Length: 88)

5'-GGCACCTCCTACCCAGACGTGCTGAAGTGCCTGAAGGCTCCTATCCTGAGCGATTCCTC
CTGTAAGTCCGCCTACCCTGGCCAGATTT-3'

BT12 (Sequence I.D. 12) (Sequence Length: 88)

5'-CTAGAAATCTGGCCAGGGTAGGCGGACTTACAGGAGGAATCGCTCAGGATAGGAGCCTT
CAGGCACTTCAGCACGTCTGGGTAGGAGG-3'

BT13 (Sequence I.D. 13) (Sequence Length: 77)

5'-AGCTTCCTGGCCAGATTACCAGCAACATGTTCTGTGCCGGCTACCTGGAGGGCGGCAAG
GATTCCTGTCAGGGTGAT-3'

BT14 (Sequence I.D. 14) (Sequence Length: 77)

5'-CAGAATCACCCTGACAGGAATCCTTGCCGCCCTCCAGGTAGCCGGCACAGAACATGTTG
CTGGTAATCTGGCCAGGA-3'

BT15 (Sequence I.D. 15) (Sequence Length: 76)

5'-TCTGGTGGCCCTGTGGTCTGCTCCGGCAAGCTCCAAGGCATCGTCTCCTGGGGTTCCGG
CTGTGCCCAGAAGAACA-3'

BT16 (Sequence I.D. 16) (Sequence Length: 76)

5'-GGCTTGTTCTTCTGGGCACAGCCGGAACCCCAGGAGACGATGCCTTGGAGCTTGCCGGA
GCAGACCACAGGGCCAC-3'

BT17 (Sequence I.D. 17) (Sequence Length: 74)

5'-AGCCTGGCGTCTACACCAAGGTCTGTAACTATGTGTCCTGGATTAAGCAGACCATAGCT
TCCAATTAGGATCCT-3'

BT18 (Sequence I.D. 18) (Sequence Length: 74)

5'-CTAGAGGATCCTAATTGGAAGCTATGGTCTGCTTAATCCAGGACACATAGTTACAGACC
TTGGTGTAGACGCCA-3'

(Sequence I.D. 19) (Sequence Length: 45)

5'-TATGGTGGATGATGATGATAAGATCGTTGGCGGCTACACCTGTGG-3'

(Sequence I.D. 20) (Sequence Length: 45)

5'-CGCCACAGGTGTAGCCGCCAACGATCTTATCATCATCATCCACCA-3'
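
The pairing of complementary oligonucleotides in the list above can be checked computationally. The sketch below is an illustration only, with hypothetical helper names; it takes the two 45-mers of Sequence I.D. 19 and Sequence I.D. 20 exactly as listed and asks how they align when annealed. The variable names `seq19` and `seq20` are labels chosen here, not designations from the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def anneal(top: str, bottom: str):
    """Find the smallest 5' overhang of `top` that lets the strands pair.

    Returns (overhang_length, paired_length), or None if no perfect
    alignment exists. Both strands are given 5' to 3'.
    """
    rc_bottom = reverse_complement(bottom)
    for k in range(len(top)):
        n = min(len(top) - k, len(rc_bottom))
        if top[k:k + n] == rc_bottom[:n]:
            return k, n
    return None


seq19 = "TATGGTGGATGATGATGATAAGATCGTTGGCGGCTACACCTGTGG"
seq20 = "CGCCACAGGTGTAGCCGCCAACGATCTTATCATCATCATCCACCA"

overhang, paired = anneal(seq19, seq20)
print("5' overhang on Sequence I.D. 19:", seq19[:overhang])
print("5' overhang on Sequence I.D. 20:", seq20[:len(seq20) - paired])
print("paired bases:", paired)
```

Under this check the two strands pair over 43 bases and leave two-base 5' extensions, TA and CG, which is the geometry expected for a linker placed between the NdeI and NarI sites shown at the start of Formula II.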



Example 2 

Construction of pRMG1 

A. Preparation of 231 base pair HindIII-XbaI gene segment

Six µg of oligonucleotides BT2, BT3A, BT4A, and BT5 were individually phosphorylated in 20 µl
reactions containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol, 5% glycerol, 100 µM
adenosine triphosphate, and 20 units T4 polynucleotide kinase (Boehringer Mannheim, Indianapolis, IN) at
37°C for 30 min. The kinase was thermally inactivated by heating at 70°C for 5 min.

Six µg of each of the above phosphorylated oligonucleotides was mixed with 6 µg (6 µl) each of
oligonucleotides BT1 and BT6, heated at 70°C for 5 min. and cooled to room temperature to allow the
oligonucleotides to anneal. The annealed oligonucleotides were then treated with 30 units T4 DNA ligase
(Boehringer Mannheim, Indianapolis, IN) in a 200 µl reaction containing 50 mM Tris-HCl (pH 7.8), 10 mM
MgCl2, 5 mM dithiothreitol, 5% glycerol, and 100 µM adenosine triphosphate for 1 hour at 20°C then 18
hours at 15°C.

The desired 231 base pair DNA fragment was conventionally purified on an 8% polyacrylamide gel
and resuspended in water. Two µg of the purified DNA fragment was treated with 20 units of T4
polynucleotide kinase in a 20 µl reaction containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol,
5% glycerol, and 100 µM adenosine triphosphate at 37°C for 30 min.
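
As a rough check on the phosphorylation conditions described above, the mass of oligonucleotide can be converted to moles of 5' ends and compared with the ATP supplied. The sketch below is an estimate under a stated assumption: an average mass of about 330 g/mol per nucleotide, a common approximation that does not appear in this disclosure. The function names are hypothetical.

```python
AVG_NT_MASS = 330.0  # g/mol per nucleotide; assumed average, not from the source


def oligo_pmol(mass_ug: float, length_nt: int) -> float:
    """Approximate pmol of a single-stranded oligonucleotide.

    ug x 1e6 gives pg, and pg divided by g/mol gives pmol.
    """
    return mass_ug * 1e6 / (length_nt * AVG_NT_MASS)


def solute_pmol(conc_um: float, volume_ul: float) -> float:
    """pmol of solute in a reaction: micromolar x microlitres equals pmol."""
    return conc_um * volume_ul


# 6 ug of the 81-nucleotide oligonucleotide BT3A in a 20 ul kinase reaction
ends = oligo_pmol(6.0, 81)
atp = solute_pmol(100.0, 20.0)
print(f"5' ends: about {ends:.0f} pmol")
print(f"ATP: {atp:.0f} pmol, about {atp / ends:.1f}-fold over 5' ends")
```

On these assumptions the ATP is present in several-fold molar excess over the 5' termini to be phosphorylated; the exact ratio shifts with oligonucleotide length and with the per-nucleotide mass assumed.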

B. Preparation of pBluescript SK+ vector

Twenty µg of plasmid pBluescript SK+ (Stratagene, La Jolla, CA) was digested to completion with 100
units HindIII (Boehringer Mannheim, Indianapolis, IN) and 100 units XbaI (Boehringer Mannheim,
Indianapolis, IN) in a 250 µl reaction containing 50 mM Tris-HCl (pH 8.0), 10 mM MgCl2, 50 mM NaCl, and 100
µg/ml bovine serum albumin at 37°C for one hour. The enzymes were thermally inactivated by heating at
70°C for 10 min.

The 5' termini were dephosphorylated by treatment of the DNA with 5 units (5 µl) calf intestinal alkaline
phosphatase (Boehringer Mannheim, Indianapolis, IN) at 37°C for 30 min. The enzyme was thermally
inactivated by heating at 70°C for 15 min. The solution was extracted with an equal volume of phenol
equilibrated with 100 mM Tris-HCl (pH 8.0). The aqueous layer was recovered and DNA was precipitated by
the addition of 0.1 volume 3 M sodium acetate and 2.2 volumes of absolute ethanol. The DNA was collected
by centrifugation and resuspended in 300 µl water.

C. Final construction of pRMG1

1.3 µg of the purified 231 base pair fragment prepared in Example 2A and 0.3 µg of the pBluescript
vector DNA prepared in Example 2B were ligated with 10 units of T4 DNA ligase in a 10 µl reaction
containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol, 5% glycerol, and 100 µM adenosine
triphosphate at 20°C for 18 hours.

A portion of the ligation mixture was used to transform E. coli K12 MM294 cells. Transformants were
selected on L agar containing 50 µg/ml ampicillin. Ampicillin-resistant transformants containing the desired
plasmid pRMG1 were identified following plasmid DNA purification by restriction enzyme site analysis and
nucleotide sequencing.

Example 3

Construction of pRMG2

A. Preparation of 265 base pair HindIII-XbaI gene segment

Six µg of oligonucleotides BT8, BT9, BT10, and BT11 were individually phosphorylated in 20 µl
reactions containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol, 5% glycerol, 100 µM
adenosine triphosphate, and 20 units T4 polynucleotide kinase (Boehringer Mannheim, Indianapolis, IN) at 37°C
for 30 min. The kinase was thermally inactivated by heating at 70°C for 5 min.

Six µg of each of the above phosphorylated oligonucleotides was mixed with 6 µg (6 µl) each of
oligonucleotides BT7 and BT12, heated at 70°C for 5 min. and cooled to room temperature to allow the
oligonucleotides to anneal. The annealed oligonucleotides were then treated with 30 units T4 DNA ligase
(Boehringer Mannheim, Indianapolis, IN) in a 200 µl reaction containing 50 mM Tris-HCl (pH 7.8), 10 mM
MgCl2, 5 mM dithiothreitol, 5% glycerol, and 100 µM adenosine triphosphate for 1 hour at 20°C then 18
hours at 15°C.

The desired 265 base pair DNA fragment was conventionally purified on an 8% polyacrylamide gel
and resuspended in water. Two µg of the purified DNA fragment was treated with 20 units of T4
polynucleotide kinase in a 20 µl reaction containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol,
5% glycerol, and 100 µM adenosine triphosphate at 37°C for 30 min.

B. Final construction of pRMG2

1.3 µg of the purified 265 base pair fragment prepared in Example 3A and 0.3 µg of the pBluescript
vector DNA prepared in Example 2B were ligated with 10 units of T4 DNA ligase in a 10 µl reaction
containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol, 5% glycerol, and 100 µM adenosine
triphosphate at 20°C for 18 hours.

A portion of the ligation mixture was used to transform E. coli K12 MM294 cells. Transformants were
selected on L agar containing 50 µg/ml ampicillin. Ampicillin-resistant transformants containing the desired
plasmid pRMG2 were identified following plasmid DNA purification by restriction enzyme site analysis and
nucleotide sequencing.

Example 4

Construction of pRMG3

A. Preparation of 227 base pair HindIII-XbaI gene segment

Six µg of oligonucleotides BT14, BT15, BT16, and BT17 were individually phosphorylated in 20 µl
reactions containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol, 5% glycerol, 100 µM
adenosine triphosphate, and 20 units T4 polynucleotide kinase (Boehringer Mannheim, Indianapolis, IN) at
37°C for 30 min. The kinase was thermally inactivated by heating at 70°C for 5 min.

Six µg of each of the above phosphorylated oligonucleotides was mixed with 6 µg (6 µl) each of
oligonucleotides BT13 and BT18, heated at 70°C for 5 min. and cooled to room temperature to allow the
oligonucleotides to anneal. The annealed oligonucleotides were then treated with 30 units T4 DNA ligase
(Boehringer Mannheim, Indianapolis, IN) in a 200 µl reaction containing 50 mM Tris-HCl (pH 7.8), 10 mM
MgCl2, 5 mM dithiothreitol, 5% glycerol, and 100 µM adenosine triphosphate for 1 hour at 20°C then 18
hours at 15°C.

The desired 227 base pair DNA fragment was conventionally purified on an 8% polyacrylamide gel
and resuspended in water. Two µg of the purified DNA fragment was treated with 20 units of T4
polynucleotide kinase in a 20 µl reaction containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol,
5% glycerol, and 100 µM adenosine triphosphate at 37°C for 30 min.

B. Final construction of pRMG3

1.3 µg of the purified 227 base pair fragment prepared in Example 4A and 0.3 µg of the pBluescript
vector DNA prepared in Example 2B were ligated with 10 units of T4 DNA ligase in a 10 µl reaction
containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol, 5% glycerol, and 100 µM adenosine
triphosphate at 20°C for 18 hours.

A portion of the ligation mixture was used to transform E. coli K12 MM294 cells. Transformants were
selected on L agar containing 50 µg/ml ampicillin. Ampicillin-resistant transformants containing the desired
plasmid pRMG3 were identified following plasmid DNA purification by restriction enzyme site analysis and
nucleotide sequencing.

Example 5

Construction of pRMG4

A. Preparation of the 218 Base Pair ApaLI-NdeI Restriction Fragment of pRMG1

Thirty µg of plasmid pRMG1 was digested to completion with 120 units of ApaLI (New England Biolabs,
Beverly, MA) in a 600 µl reaction containing 10 mM Tris-HCl (pH 7.5), 10 mM MgCl2, 1 mM dithiothreitol,
and 100 µg/ml bovine serum albumin at 37°C for two hours. The enzyme was thermally inactivated by
heating at 70°C for 10 min. The DNA was digested to completion with NdeI by supplementing the reaction with
50 mM Tris-HCl (pH 7.5), 100 mM NaCl and 120 units NdeI (Boehringer Mannheim, Indianapolis, IN) in a
750 µl reaction and incubating at 37°C for two hours. The enzyme was thermally inactivated by heating
at 70°C for 10 min. The DNA was recovered by ethanol precipitation as described in Example 2B and
resuspended in water. The desired 218 base pair fragment was conventionally purified on a 1.5% agarose
gel by electroelution onto DEAE cellulose paper and resuspended in water.

B. Preparation of the 247 Base Pair MscI-ApaLI Restriction Fragment of pRMG2

Thirty µg of pRMG2 was digested to completion with 75 units (25 µl) MscI (an isoschizomer of BalI,
New England Biolabs, Beverly, MA) and 120 units (12 µl) ApaLI in a 750 µl reaction containing 10 mM
Tris-HCl (pH 7.5), 10 mM MgCl2, 1 mM dithiothreitol, and 100 µg/ml bovine serum albumin at 37°C for two hours.
The enzymes were thermally inactivated by heating at 70°C for 10 min. The DNA was recovered by ethanol
precipitation as described in Example 2B and resuspended in water. The desired 247 base pair fragment
was conventionally purified on a 1.5% agarose gel by electroelution onto DEAE cellulose paper and
resuspended in water.

C. Preparation of the 211 Base Pair MscI-BamHI Restriction Fragment of pRMG3

Thirty µg of pRMG3 was digested to completion with 75 units (25 µl) MscI in a 600 µl reaction
containing 50 mM potassium acetate, 20 mM Tris-acetate (pH 7.9), 10 mM magnesium acetate, 1 mM
dithiothreitol, and 100 µg/ml bovine serum albumin at 37°C for 2 hours. Tris-acetate is Trizma® acetate
(Tris[hydroxymethyl]aminomethane acetate) and is available from Sigma Chemical Co., St. Louis, MO
63187. The enzyme was thermally inactivated at 70°C for 10 min. The DNA was digested to completion
with BamHI by supplementing the reaction with 50 mM NaCl and 120 units of BamHI in a 750 µl reaction
and incubating at 37°C for 2 hours. The enzyme was thermally inactivated by heating at 70°C for 10 min.
The DNA was recovered by ethanol precipitation as described in Example 2B and resuspended in water.
The desired 211 base pair fragment was conventionally purified on a 1.5% agarose gel by electroelution
onto DEAE cellulose paper and resuspended in water.

D. Preparation of pHKY390 expression vector

Twenty µg of plasmid pHKY390 was digested to completion with 240 units NdeI (Boehringer Mannheim,
Indianapolis, IN) and 80 units BamHI (Boehringer Mannheim, Indianapolis, IN) in a 100 microliter
reaction containing 50 mM Tris-HCl (pH 8.0), 10 mM MgCl2, 100 mM NaCl, and 100 µg/ml bovine serum
albumin at 37°C for 1 hr. The enzymes were thermally inactivated by heating at 70°C for 10 min.

The 5' termini were dephosphorylated by treatment of the DNA with 5 units (5 µl) calf intestinal
alkaline phosphatase (Boehringer Mannheim, Indianapolis, IN) at 37°C for 30 min. The enzyme was
thermally inactivated by heating at 70°C for 15 min. The solution was extracted with an equal
volume of phenol equilibrated with 100 mM Tris-HCl (pH 8.0). The aqueous layer was recovered and
DNA was precipitated by the addition of 0.1 volume 3 M sodium acetate and 2.2 volumes of absolute
ethanol. The DNA was collected by centrifugation and resuspended in 300 µl water.
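The precipitation step is stated in relative volumes, so the reagent amounts scale with the sample. The following minimal sketch works through that arithmetic. It assumes that both "0.1 volume" and "2.2 volumes" are measured against the starting aqueous sample; the function name and the 100 µl sample are illustrative only and are not taken from the text.

```python
def precipitation_volumes(sample_ul: float) -> dict:
    """Return reagent volumes (in microliters) for an ethanol precipitation.

    Assumption: 0.1 volume of 3 M sodium acetate and 2.2 volumes of
    absolute ethanol are both relative to the starting sample volume.
    """
    acetate_ul = 0.1 * sample_ul  # 3 M sodium acetate
    ethanol_ul = 2.2 * sample_ul  # absolute ethanol
    return {
        "sodium_acetate_ul": acetate_ul,
        "ethanol_ul": ethanol_ul,
        "total_ul": sample_ul + acetate_ul + ethanol_ul,
    }


if __name__ == "__main__":
    # Hypothetical 100 microliter aqueous layer.
    for name, value in precipitation_volumes(100.0).items():
        print(f"{name}: {value:.1f}")
```

If the ethanol is instead measured against the sample after the acetate has been added, the second factor applies to 1.1 times the sample volume.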

E. Final construction of pRMG4 

Two hundred ng of the purified 218 base pair fragment prepared in Example 5A, 200 ng of the
purified 247 base pair fragment purified in Example 5B, 200 ng of the purified 211 base pair
fragment purified in Example 5C, and 100 ng of the pHKY390 vector DNA prepared in Example 5D were
ligated with 10 units of T4 DNA ligase (Boehringer Mannheim, Indianapolis, IN) in a 20 µl reaction
containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol, 5% glycerol, and 100 µM
adenosine triphosphate at 15°C for 15 hours.
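As a consistency check on the three fragment sizes, the sketch below locates the NdeI, ApaLI, MscI and BamHI sites in the assembled trypsin gene and counts the top-strand distance between cut positions. It rests on two assumptions: the sequence is SEQ ID NO: 21 transcribed codon by codon (where the printed listing is ambiguous, the reading follows the overlapping oligonucleotides of SEQ ID NOS: 1-18), and each fragment spans the same coordinates in its source plasmid as in the final gene. The function names are illustrative.

```python
# SEQ ID NO: 21 (683 nt), grouped by codon; see the assumptions above.
SEQ21 = "".join("""
CAT ATG ATC GTT GGC GGC TAC ACC TGT GGC GCC AAT ACC GTC CCG
TAC CAG GTG TCC CTG AAT TCT GGC TAC CAC TTC TGT GGT GGC TCC
CTC ATC AAC TCC CAG TGG GTG GTA TCA GCG GCC CAC TGC TAC AAG
TCC GGC ATC CAG GTG CGT CTG GGC GAG GAT AAC ATC AAC GTC GTG
GAG GGC AAT GAG CAG TTC ATC TCC GCA TCC AAG TCC ATC GTG CAC
CCG TCC TAC AAC TCC AAC ACT CTG AAC AAT GAC ATC ATG CTG ATC
AAG CTC AAG TCC GCC GCA TCC CTG AAC TCC CGC GTG GCC TCC ATC
TCT CTG CCG ACC TCC TGT GCC TCC GCC GGC ACG CAG TGC CTC ATC
TCT GGC TGG GGC AAC ACT AAG AGC TCT GGC ACC TCC TAC CCA GAC
GTG CTG AAG TGC CTG AAG GCT CCT ATC CTG AGC GAT TCC TCC TGT
AAG TCC GCC TAC CCT GGC CAG ATT ACC AGC AAC ATG TTC TGT GCC
GGC TAC CTG GAG GGC GGC AAG GAT TCC TGT CAG GGT GAT TCT GGT
GGC CCT GTG GTC TGC TCC GGC AAG CTC CAA GGC ATC GTC TCC TGG
GGT TCC GGC TGT GCC CAG AAG AAC AAG CCT GGC GTC TAC ACC AAG
GTC TGT AAC TAT GTG TCC TGG ATT AAG CAG ACC ATA GCT TCC AAT
TAG GAT CC
""".split())

# Recognition sequence and top-strand cut offset for each enzyme.
ENZYMES = {
    "NdeI": ("CATATG", 2),   # CA^TATG
    "ApaLI": ("GTGCAC", 1),  # G^TGCAC
    "MscI": ("TGGCCA", 3),   # TGG^CCA (blunt)
    "BamHI": ("GGATCC", 1),  # G^GATCC
}


def cut_position(seq: str, enzyme: str) -> int:
    """Return the 0-based top-strand cut position of a unique site."""
    site, offset = ENZYMES[enzyme]
    if seq.count(site) != 1:
        raise ValueError(f"{enzyme} site is not unique in the sequence")
    return seq.index(site) + offset


def fragment_lengths(seq: str, order: list) -> list:
    """Return top-strand lengths between consecutive cut positions."""
    cuts = [cut_position(seq, name) for name in order]
    return [b - a for a, b in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    print(fragment_lengths(SEQ21, ["NdeI", "ApaLI", "MscI", "BamHI"]))
```

Under these assumptions the three distances come out as 218, 247 and 211, matching the fragment sizes named in Examples 5A, 5B and 5C.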

A portion of the ligation mixture was used to transform E. coli K12 MM294 cells. Transformants
were selected on L agar containing 10 µg/ml tetracycline. Tetracycline resistant transformants
containing the desired plasmid pRMG4 were identified following plasmid DNA purification by
restriction enzyme site analysis and nucleotide sequencing of the trypsin gene.

Example 6

Construction of pRMG5

A. Preparation of the 225 base pair ApaLI-HindIII restriction fragment of pRMG1

Twenty µg of pRMG1 was digested to completion with 80 units of ApaLI (New England Biolabs,
Beverly, MA) in a 100 µl reaction containing 10 mM Tris-HCl (pH 7.5), 10 mM MgCl2, 1 mM
dithiothreitol, and 100 µg/ml bovine serum albumin at 37°C for 2 hours. The enzyme was thermally
inactivated by heating at 70°C for 10 min. The DNA was digested to completion with HindIII by
supplementing the reaction with 50 mM Tris-HCl (pH 7.5), 50 mM NaCl, and 80 units of HindIII in a
125 µl reaction and incubating at 37°C for 2 hours. The enzyme was thermally inactivated by
heating at 70°C for 10 min. The DNA was recovered by ethanol precipitation as described in
Example 2B and resuspended in water. The desired 225 base pair fragment was conventionally
purified on a 1.5% agarose gel by electroelution onto DEAE cellulose paper and resuspended in
water.

B. Preparation of pRMG3 vector 

Thirty µg of pRMG3 was digested to completion with 75 units of MscI (an isoschizomer of BalI, New
England Biolabs, Beverly, MA) in a 600 µl reaction containing 50 mM potassium acetate, 20 mM Tris
acetate, 10 mM magnesium acetate, 1 mM dithiothreitol, and 100 µg/ml bovine serum albumin at 37°C
for 2 hours. The enzyme was thermally inactivated by heating at 70°C for 10 min. The DNA was
digested to completion with HindIII by supplementing the reaction with 50 mM Tris-HCl, 50 mM NaCl,
and 120 units of HindIII in a 750 µl reaction and incubating at 37°C for 2 hours. The enzyme was
thermally inactivated at 70°C for 10 min.

The 5' termini were dephosphorylated and the DNA was recovered by ethanol precipitation as
described in Example 1B and resuspended in water.

C. Final construction of pRMG5 

Two hundred ng of the purified 225 base pair fragment prepared in Example 5A, 200 ng of the 247
base pair fragment prepared in Example 4B, and 50 ng of the pRMG3 vector DNA prepared in Example
6B were ligated with 10 units of T4 DNA ligase in a 40 µl reaction containing 50 mM Tris-HCl
(pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol, 5% glycerol, and 100 µM adenosine triphosphate at 15°C
for 15 hours.

A portion of the ligation mixture was used to transform E. coli K12 MM294 cells. Transformants
were selected on L agar containing 50 µg/ml ampicillin. Ampicillin resistant transformants
containing the desired plasmid pRMG5 were identified following plasmid DNA purification by
restriction enzyme site analysis and nucleotide sequencing of the trypsin gene.

Example 7 

Construction of pRMG6

A. Preparation of the 45 base pair NdeI-NarI segment

Seven µg of oligonucleotides BT19 and BT20 were individually phosphorylated in 20 µl reactions
containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol, 5% glycerol, 100 µM
adenosine triphosphate, and 20 units of T4 polynucleotide kinase (Boehringer Mannheim,
Indianapolis, IN) at 37°C for 30 min. The kinase was thermally inactivated by heating at 70°C for
10 min.

The two 20 µl reactions were subsequently mixed, then heated to 70°C for 5 min. and cooled to room
temperature to allow the BT19 and BT20 oligonucleotides to anneal.
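The annealing step can be checked on paper by comparing one strand with the reverse complement of the other. The sketch below does this under the assumption that BT19 and BT20 correspond to SEQ ID NO: 19 and SEQ ID NO: 20 of the sequence listing; the helper names are illustrative. It expects a 2-nucleotide 5' overhang on each strand, which is what NdeI (CA^TATG) and NarI (GG^CGCC) ends would pair with.

```python
# Assumed identities: BT19 = SEQ ID NO: 19, BT20 = SEQ ID NO: 20.
BT19 = "TATGGTGGATGATGATGATAAGATCGTTGGCGGCTACACCTGTGG"
BT20 = "CGCCACAGGTGTAGCCGCCAACGATCTTATCATCATCATCCACCA"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(top: str, bottom: str, overhang: int = 2) -> tuple:
    """Pair two oligonucleotides that each carry a 5' overhang.

    Returns (top 5' overhang, duplex region on the top strand,
    bottom 5' overhang) or raises ValueError if the strands do not pair.
    """
    duplex = top[overhang:]
    bottom_rc = reverse_complement(bottom)
    if bottom_rc[:len(duplex)] != duplex:
        raise ValueError("oligonucleotides are not complementary")
    return top[:overhang], duplex, bottom[:overhang]


if __name__ == "__main__":
    top_end, duplex, bottom_end = anneal(BT19, BT20)
    print(top_end, len(duplex), bottom_end)
```

With these two sequences the paired region is 43 base pairs long, flanked by a 5'-TA overhang on BT19 and a 5'-CG overhang on BT20.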

B. Preparation of pRMG5 vector 

Twenty µg of pRMG5 was digested to completion with 40 units of NarI (Bethesda Research
Laboratories, Gaithersburg, MD) in a 100 µl reaction containing 50 mM Tris-HCl (pH 8.0), 10 mM
MgCl2, and 100 µg/ml bovine serum albumin at 37°C for 2 hours. The enzyme was thermally
inactivated by heating at 70°C for 10 min. The DNA was digested to completion with NdeI by
supplementing the reaction with 50 mM NaCl and 80 units of NdeI in a 125 µl reaction and
incubating at 37°C for 2 hours. The enzyme was thermally inactivated by heating at 70°C for
10 min.

The 5' termini were dephosphorylated and the DNA was recovered by ethanol precipitation as
described in Example 1B and resuspended in water.

C. Final construction of pRMG6 

Three hundred and fifty ng of the 45 base pair NarI-NdeI fragment prepared in Example 7A and 100
ng of the pRMG5 vector DNA prepared in Example 7B were ligated with 10 units of T4 DNA ligase in a
20 µl reaction containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol, 5% glycerol,
and 100 µM adenosine triphosphate at 15°C for 15 hours.

A portion of the ligation mixture was used to transform E. coli K12 MM294 cells. Transformants
were selected on L agar containing 50 µg/ml ampicillin. Ampicillin-resistant transformants
containing the desired pRMG6 DNA were identified following plasmid DNA purification by restriction
enzyme site analysis and nucleotide sequencing of the trypsinogen gene.
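The effect of this ligation on the 5' end of the gene can be sketched as a string operation: the stretch between the NdeI and NarI cut positions is removed and the linker top strand is put in its place. The example assumes that the first 45 nucleotides of the trypsin gene are as given at the start of SEQ ID NO: 21 and that the linker top strand is SEQ ID NO: 19; the function name is illustrative.

```python
# Assumed inputs: start of SEQ ID NO: 21 and the BT19 top strand (SEQ ID NO: 19).
TRYPSIN_5PRIME = "CATATGATCGTTGGCGGCTACACCTGTGGCGCCAATACCGTCCCG"
BT19 = "TATGGTGGATGATGATGATAAGATCGTTGGCGGCTACACCTGTGG"

NDEI = ("CATATG", 2)  # CA^TATG
NARI = ("GGCGCC", 2)  # GG^CGCC


def replace_ndei_nari(seq: str, linker_top: str) -> str:
    """Swap the NdeI-NarI segment of seq for the linker top strand."""
    nde_cut = seq.index(NDEI[0]) + NDEI[1]
    nar_cut = seq.index(NARI[0]) + NARI[1]
    if nar_cut <= nde_cut:
        raise ValueError("NarI site must lie downstream of the NdeI site")
    return seq[:nde_cut] + linker_top + seq[nar_cut:]


if __name__ == "__main__":
    trypsinogen_5prime = replace_ndei_nari(TRYPSIN_5PRIME, BT19)
    print(trypsinogen_5prime)
    print(len(trypsinogen_5prime) - len(TRYPSIN_5PRIME))
```

Both sites are regenerated at the junctions, and the insert adds 18 nucleotides (six codons, Val-Asp-Asp-Asp-Asp-Lys, after the initiating Met), which agrees with the difference between the 683 base pair and 701 base pair sequences in the listing.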



Example 8 


Construction of pRMG7 

A. Preparation of the 695 base pair BamHI-NdeI trypsinogen gene

Twenty µg of plasmid pRMG6 was digested to completion with 36 units of BamHI (Boehringer Mannheim,
Indianapolis, IN) and 20 units of NdeI (New England Biolabs, Beverly, MA) in a 40 µl reaction
containing 50 mM Tris-HCl (pH 8.0), 10 mM MgCl2, 100 mM NaCl, and 100 µg/ml bovine serum albumin
at 37°C for 1 hour. The enzymes were thermally inactivated by heating at 70°C for 10 min. The DNA
was recovered by ethanol precipitation as described in Example 2B and resuspended in water.

B. Final construction of pRMG7 

Three hundred and fifty ng of the restricted pRMG6 DNA prepared in Example 8A and 100 ng of the
pHKY390 vector DNA prepared in Example 5D were ligated with 10 units of T4 DNA ligase in a 25 µl
reaction containing 50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 5 mM dithiothreitol, 5% glycerol, and
100 µM adenosine triphosphate at 15°C for 15 hours.

A portion of the ligation mixture was used to transform E. coli K12 MM294 cells. Transformants
were selected on L agar containing 10 µg/ml tetracycline. Tetracycline-resistant transformants
containing the desired plasmid pRMG7 were identified following plasmid DNA purification by
restriction enzyme site analysis and nucleotide sequencing of the trypsinogen gene.



Example 9 



Construction of L693/pRMG4

A. Transformation of L693 with pRMG4 

The E. coli strain L693 was transformed with plasmid pRMG4 DNA from Example 4E. Transformants were
selected on L agar containing 10 µg/ml tetracycline. Tetracycline-resistant transformants
containing the desired plasmid pRMG4 were identified by restriction enzyme site analysis and
nucleotide sequencing of the trypsin gene.



Example 10 


Construction of L687/pRMG7 

A. Transformation of L687 with pRMG7 

The lon- E. coli strain L687 was transformed with plasmid pRMG7 DNA from Example 8B. Transformants
were selected on L agar containing 10 µg/ml tetracycline. Tetracycline-resistant transformants
containing the desired pRMG7 were identified by restriction enzyme site analysis and nucleotide
sequencing of the trypsinogen gene.

(1) GENERAL INFORMATION:



(i) APPLICANT:
(A) NAME: ELI LILLY AND COMPANY
(B) STREET: Lilly Corporate Center
(C) CITY: Indianapolis
(D) STATE: Indiana
(E) COUNTRY: United States of America
(F) ZIP: 46285


(ii) TITLE OF INVENTION: Expression Vectors for Bovine 
Trypsin and Trypsinogen and Host Cells Transformed Therewith 

(iii) NUMBER OF SEQUENCES: 24 

(iv) CORRESPONDENCE ADDRESS: 



(A) ADDRESSEE: C. M. Hudson
(B) STREET: Erl Wood Manor
(C) CITY: Windlesham
(D) STATE: Surrey
(E) COUNTRY: United Kingdom
(F) ZIP: GU20 6PH



(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.0 Mb storage
(B) COMPUTER: Macintosh
(C) OPERATING SYSTEM: Macintosh
(D) SOFTWARE: Microsoft Word

(2) INFORMATION FOR SEQ ID NO: 1

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 77 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1

AGCTTCATAT GATCGTTGGC GGCTACACCT GTGGCGCCAA TACCGTCCCG
TACCAGGTGT CCCTGAATTC TGGCTAC



(2) INFORMATION FOR SEQ ID NO: 2

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 77 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2

AGTGGTAGCC AGAATTCAGG GACACCTGGT ACGGGACGGT ATTGGCGCCA
CAGGTGTAGC CGCCAACGAT CATATGA

(2) INFORMATION FOR SEQ ID NO: 3

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 81 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3

CACTTCTGTG GTGGCTCCCT CATCAACTCC CAGTGGGTGG TATCAGCGGC 50
CCACTGCTAC AAGTCCGGCA TCCAGGTGCG T 81



(2) INFORMATION FOR SEQ ID NO: 4

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 81 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4

CCAGACGCAC CTGGATGCCG GACTTGTAGC AGTGGGCCGC TGATACCACC 50
CACTGGGAGT TGATGAGGGA GCCACCACAG A 81

(2) INFORMATION FOR SEQ ID NO: 5

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 73 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5

CTGGGCGAGG ATAACATCAA CGTCGTGGAG GGCAATGAGC AGTTCATCTC
CGCATCCAAG TCCATCGTGC ACT



(2) INFORMATION FOR SEQ ID NO: 6

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 73 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: double stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6

CTAGAGTGCA CGATGGACTT GGATGCGGAG ATGAACTGCT CATTGCCCTC
CACGACGTTG ATGTTATCCT CGC



(2) INFORMATION FOR SEQ ID NO: 7

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 84 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7

AGCTTCATCG TGCACCCGTC CTACAACTCC AACACTCTGA ACAATGACAT
CATGCTGATC AAGCTCAAGT CCGCCGCATC CCTG



(2) INFORMATION FOR SEQ ID NO: 8

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 84 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8

AGTTCAGGGA TGCGGCGGAC TTGAGCTTGA TCAGCATGAT GTCATTGTTC
AGAGTGTTGG AGTTGTAGGA CGGGTGCACG ATGA

(2) INFORMATION FOR SEQ ID NO: 9

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9

AACTCCCGCG TGGCCTCCAT CTCTCTGCCG ACCTCCTGTG CCTCCGCCGG
CACGCAGTGC CTCATCTCTG GCTGGGGCAA CACTAAGAGC TCT



(2) INFORMATION FOR SEQ ID NO: 10

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10

TGCCAGAGCT CTTAGTGTTG CCCCAGCCAG AGATGAGGCA CTGCGTGCCG
GCGGAGGCAC AGGAGGTCGG CAGAGAGATG GAGGCCACGC GGG

(2) INFORMATION FOR SEQ ID NO: 11

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 88 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11

GGCACCTCCT ACCCAGACGT GCTGAAGTGC CTGAAGGCTC CTATCCTGAG
CGATTCCTCC TGTAAGTCCG CCTACCCTGG CCAGATTT



(2) INFORMATION FOR SEQ ID NO: 12

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 88 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12

CTAGAAATCT GGCCAGGGTA GGCGGACTTA CAGGAGGAAT CGCTCAGGAT
AGGAGCCTTC AGGCACTTCA GCACGTCTGG GTAGGAGG

(2) INFORMATION FOR SEQ ID NO: 13

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 77 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13

AGCTTCCTGG CCAGATTACC AGCAACATGT TCTGTGCCGG CTACCTGGAG
GGCGGCAAGG ATTCCTGTCA GGGTGAT

(2) INFORMATION FOR SEQ ID NO: 14

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 77 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14

CAGAATCACC CTGACAGGAA TCCTTGCCGC CCTCCAGGTA GCCGGCACAG
AACATGTTGC TGGTAATCTG GCCAGGA

(2) INFORMATION FOR SEQ ID NO: 15

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 76 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15

TCTGGTGGCC CTGTGGTCTG CTCCGGCAAG CTCCAAGGCA TCGTCTCCTG
GGGTTCCGGC TGTGCCCAGA AGAACA

(2) INFORMATION FOR SEQ ID NO: 16

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 76 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16

GGCTTGTTCT TCTGGGCACA GCCGGAACCC CAGGAGACGA TGCCTTGGAG
CTTGCCGGAG CAGACCACAG GGCCAC



(2) INFORMATION FOR SEQ ID NO: 17

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 74 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17

AGCCTGGCGT CTACACCAAG GTCTGTAACT ATGTGTCCTG GATTAAGCAG
ACCATAGCTT CCAATTAGGA TCCT

(2) INFORMATION FOR SEQ ID NO: 18

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 74 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18

CTAGAGGATC CTAATTGGAA GCTATGGTCT GCTTAATCCA GGACACATAG
TTACAGACCT TGGTGTAGAC GCCA



(2) INFORMATION FOR SEQ ID NO: 19

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19

TATGGTGGAT GATGATGATA AGATCGTTGG CGGCTACACC TGTGG

(2) INFORMATION FOR SEQ ID NO: 20

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20

CGCCACAGGT GTAGCCGCCA ACGATCTTAT CATCATCATC CACCA






EP 0 597 681 A1 



(2) INFORMATION FOR SEQ ID NO: 21

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 683 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: double stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21

CAT ATG ATC GTT GGC GGC TAC ACC TGT GGC GCC AAT ACC GTC CCG 45
    Met Ile Val Gly Gly Tyr Thr Cys Gly Ala Asn Thr Val Pro 14

TAC CAG GTG TCC CTG AAT TCT GGC TAC CAC TTC TGT GGT GGC TCC 90
Tyr Gln Val Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser 29

CTC ATC AAC TCC CAG TGG GTG GTA TCA GCG GCC CAC TGC TAC AAG 135
Leu Ile Asn Ser Gln Trp Val Val Ser Ala Ala His Cys Tyr Lys 44

TCC GGC ATC CAG GTG CGT CTG GGC GAG GAT AAC ATC AAC GTC GTG 180
Ser Gly Ile Gln Val Arg Leu Gly Glu Asp Asn Ile Asn Val Val 59

GAG GGC AAT GAG CAG TTC ATC TCC GCA TCC AAG TCC ATC GTG CAC 225
Glu Gly Asn Glu Gln Phe Ile Ser Ala Ser Lys Ser Ile Val His 74

CCG TCC TAC AAC TCC AAC ACT CTG AAC AAT GAC ATC ATG CTG ATC 270
Pro Ser Tyr Asn Ser Asn Thr Leu Asn Asn Asp Ile Met Leu Ile 89

AAG CTC AAG TCC GCC GCA TCC CTG AAC TCC CGC GTG GCC TCC ATC 315
Lys Leu Lys Ser Ala Ala Ser Leu Asn Ser Arg Val Ala Ser Ile 104

TCT CTG CCG ACC TCC TGT GCC TCC GCC GGC ACG CAG TGC CTC ATC 360
Ser Leu Pro Thr Ser Cys Ala Ser Ala Gly Thr Gln Cys Leu Ile 119

TCT GGC TGG GGC AAC ACT AAG AGC TCT GGC ACC TCC TAC CCA GAC 405
Ser Gly Trp Gly Asn Thr Lys Ser Ser Gly Thr Ser Tyr Pro Asp 134

GTG CTG AAG TGC CTG AAG GCT CCT ATC CTG AGC GAT TCC TCC TGT 450
Val Leu Lys Cys Leu Lys Ala Pro Ile Leu Ser Asp Ser Ser Cys 149

AAG TCC GCC TAC CCT GGC CAG ATT ACC AGC AAC ATG TTC TGT GCC 495
Lys Ser Ala Tyr Pro Gly Gln Ile Thr Ser Asn Met Phe Cys Ala 164

GGC TAC CTG GAG GGC GGC AAG GAT TCC TGT CAG GGT GAT TCT GGT 540
Gly Tyr Leu Glu Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly 179

GGC CCT GTG GTC TGC TCC GGC AAG CTC CAA GGC ATC GTC TCC TGG 585
Gly Pro Val Val Cys Ser Gly Lys Leu Gln Gly Ile Val Ser Trp 194

GGT TCC GGC TGT GCC CAG AAG AAC AAG CCT GGC GTC TAC ACC AAG 630
Gly Ser Gly Cys Ala Gln Lys Asn Lys Pro Gly Val Tyr Thr Lys 209

GTC TGT AAC TAT GTG TCC TGG ATT AAG CAG ACC ATA GCT TCC AAT 675
Val Cys Asn Tyr Val Ser Trp Ile Lys Gln Thr Ile Ala Ser Asn 224

TAGGATCC 683



(2) INFORMATION FOR SEQ ID NO: 22

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 224 amino acids
(B) TYPE: protein
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22

Met Ile Val Gly Gly Tyr Thr Cys Gly Ala Asn Thr Val Pro Tyr 15
Gln Val Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser Leu 30
Ile Asn Ser Gln Trp Val Val Ser Ala Ala His Cys Tyr Lys Ser 45
Gly Ile Gln Val Arg Leu Gly Glu Asp Asn Ile Asn Val Val Glu 60
Gly Asn Glu Gln Phe Ile Ser Ala Ser Lys Ser Ile Val His Pro 75
Ser Tyr Asn Ser Asn Thr Leu Asn Asn Asp Ile Met Leu Ile Lys 90
Leu Lys Ser Ala Ala Ser Leu Asn Ser Arg Val Ala Ser Ile Ser 105
Leu Pro Thr Ser Cys Ala Ser Ala Gly Thr Gln Cys Leu Ile Ser 120
Gly Trp Gly Asn Thr Lys Ser Ser Gly Thr Ser Tyr Pro Asp Val 135
Leu Lys Cys Leu Lys Ala Pro Ile Leu Ser Asp Ser Ser Cys Lys 150
Ser Ala Tyr Pro Gly Gln Ile Thr Ser Asn Met Phe Cys Ala Gly 165
Tyr Leu Glu Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly 180
Pro Val Val Cys Ser Gly Lys Leu Gln Gly Ile Val Ser Trp Gly 195
Ser Gly Cys Ala Gln Lys Asn Lys Pro Gly Val Tyr Thr Lys Val 210
Cys Asn Tyr Val Ser Trp Ile Lys Gln Thr Ile Ala Ser Asn 224

(2) INFORMATION FOR SEQ ID NO: 23

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 701 base pairs
(B) TYPE: Nucleic acid
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23

CAT ATG GTG GAT GAT GAT GAT AAG ATC GTT GGC GGC TAC ACC TGT 45
    Met Val Asp Asp Asp Asp Lys Ile Val Gly Gly Tyr Thr Cys 14

GGC GCC AAT ACC GTC CCG TAC CAG GTG TCC CTG AAT TCT GGC TAC 90
Gly Ala Asn Thr Val Pro Tyr Gln Val Ser Leu Asn Ser Gly Tyr 29

CAC TTC TGT GGT GGC TCC CTC ATC AAC TCC CAG TGG GTG GTA TCA 135
His Phe Cys Gly Gly Ser Leu Ile Asn Ser Gln Trp Val Val Ser 44

GCG GCC CAC TGC TAC AAG TCC GGC ATC CAG GTG CGT CTG GGC GAG 180
Ala Ala His Cys Tyr Lys Ser Gly Ile Gln Val Arg Leu Gly Glu 59

GAT AAC ATC AAC GTC GTG GAG GGC AAT GAG CAG TTC ATC TCC GCA 225
Asp Asn Ile Asn Val Val Glu Gly Asn Glu Gln Phe Ile Ser Ala 74

TCC AAG TCC ATC GTG CAC CCG TCC TAC AAC TCC AAC ACT CTG AAC 270
Ser Lys Ser Ile Val His Pro Ser Tyr Asn Ser Asn Thr Leu Asn 89

AAT GAC ATC ATG CTG ATC AAG CTC AAG TCC GCC GCA TCC CTG AAC 315
Asn Asp Ile Met Leu Ile Lys Leu Lys Ser Ala Ala Ser Leu Asn 104

TCC CGC GTG GCC TCC ATC TCT CTG CCG ACC TCC TGT GCC TCC GCC 360
Ser Arg Val Ala Ser Ile Ser Leu Pro Thr Ser Cys Ala Ser Ala 119

GGC ACG CAG TGC CTC ATC TCT GGC TGG GGC AAC ACT AAG AGC TCT 405
Gly Thr Gln Cys Leu Ile Ser Gly Trp Gly Asn Thr Lys Ser Ser 134

GGC ACC TCC TAC CCA GAC GTG CTG AAG TGC CTG AAG GCT CCT ATC 450
Gly Thr Ser Tyr Pro Asp Val Leu Lys Cys Leu Lys Ala Pro Ile 149

CTG AGC GAT TCC TCC TGT AAG TCC GCC TAC CCT GGC CAG ATT ACC 495
Leu Ser Asp Ser Ser Cys Lys Ser Ala Tyr Pro Gly Gln Ile Thr 164

AGC AAC ATG TTC TGT GCC GGC TAC CTG GAG GGC GGC AAG GAT TCC 540
Ser Asn Met Phe Cys Ala Gly Tyr Leu Glu Gly Gly Lys Asp Ser 179

TGT CAG GGT GAT TCT GGT GGC CCT GTG GTC TGC TCC GGC AAG CTC 585
Cys Gln Gly Asp Ser Gly Gly Pro Val Val Cys Ser Gly Lys Leu 194

CAA GGC ATC GTC TCC TGG GGT TCC GGC TGT GCC CAG AAG AAC AAG 630
Gln Gly Ile Val Ser Trp Gly Ser Gly Cys Ala Gln Lys Asn Lys 209

CCT GGC GTC TAC ACC AAG GTC TGT AAC TAT GTG TCC TGG ATT AAG 675
Pro Gly Val Tyr Thr Lys Val Cys Asn Tyr Val Ser Trp Ile Lys 224

CAG ACC ATA GCT TCC AAT TAGGATCC 701
Gln Thr Ile Ala Ser Asn 230
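The pairing of codons and residues in SEQ ID NO: 23 can be checked by translating the open reading frame with the standard genetic code. The sketch below is one way to do that. It assumes the nucleotide sequence as transcribed here (701 nt, with ambiguous characters in the printed listing read so that they agree with the amino acid line and with SEQ ID NOS: 1-20); the function name is illustrative.

```python
# SEQ ID NO: 23 (701 nt), grouped by codon; see the assumption above.
SEQ23 = "".join("""
CAT ATG GTG GAT GAT GAT GAT AAG ATC GTT GGC GGC TAC ACC TGT
GGC GCC AAT ACC GTC CCG TAC CAG GTG TCC CTG AAT TCT GGC TAC
CAC TTC TGT GGT GGC TCC CTC ATC AAC TCC CAG TGG GTG GTA TCA
GCG GCC CAC TGC TAC AAG TCC GGC ATC CAG GTG CGT CTG GGC GAG
GAT AAC ATC AAC GTC GTG GAG GGC AAT GAG CAG TTC ATC TCC GCA
TCC AAG TCC ATC GTG CAC CCG TCC TAC AAC TCC AAC ACT CTG AAC
AAT GAC ATC ATG CTG ATC AAG CTC AAG TCC GCC GCA TCC CTG AAC
TCC CGC GTG GCC TCC ATC TCT CTG CCG ACC TCC TGT GCC TCC GCC
GGC ACG CAG TGC CTC ATC TCT GGC TGG GGC AAC ACT AAG AGC TCT
GGC ACC TCC TAC CCA GAC GTG CTG AAG TGC CTG AAG GCT CCT ATC
CTG AGC GAT TCC TCC TGT AAG TCC GCC TAC CCT GGC CAG ATT ACC
AGC AAC ATG TTC TGT GCC GGC TAC CTG GAG GGC GGC AAG GAT TCC
TGT CAG GGT GAT TCT GGT GGC CCT GTG GTC TGC TCC GGC AAG CTC
CAA GGC ATC GTC TCC TGG GGT TCC GGC TGT GCC CAG AAG AAC AAG
CCT GGC GTC TAC ACC AAG GTC TGT AAC TAT GTG TCC TGG ATT AAG
CAG ACC ATA GCT TCC AAT TAG GAT CC
""".split())

# Standard genetic code, built from the usual TCAG-ordered table.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_orf(seq: str) -> str:
    """Translate from the first ATG up to, not including, the first stop."""
    start = seq.index("ATG")
    protein = []
    for i in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    protein = translate_orf(SEQ23)
    print(len(protein), protein[:15], protein[-6:])
```

The translation yields 230 residues beginning Met-Val-Asp-Asp-Asp-Asp-Lys and ending Thr-Ile-Ala-Ser-Asn, in line with SEQ ID NO: 24.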



(2) INFORMATION FOR SEQ ID NO: 24

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 230 amino acids
(B) TYPE: protein
(C) STRANDEDNESS: single stranded
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24

Met Val Asp Asp Asp Asp Lys Ile Val Gly Gly Tyr Thr Cys Gly 15
Ala Asn Thr Val Pro Tyr Gln Val Ser Leu Asn Ser Gly Tyr His 30
Phe Cys Gly Gly Ser Leu Ile Asn Ser Gln Trp Val Val Ser Ala 45
Ala His Cys Tyr Lys Ser Gly Ile Gln Val Arg Leu Gly Glu Asp 60
Asn Ile Asn Val Val Glu Gly Asn Glu Gln Phe Ile Ser Ala Ser 75
Lys Ser Ile Val His Pro Ser Tyr Asn Ser Asn Thr Leu Asn Asn 90
Asp Ile Met Leu Ile Lys Leu Lys Ser Ala Ala Ser Leu Asn Ser 105
Arg Val Ala Ser Ile Ser Leu Pro Thr Ser Cys Ala Ser Ala Gly 120
Thr Gln Cys Leu Ile Ser Gly Trp Gly Asn Thr Lys Ser Ser Gly 135
Thr Ser Tyr Pro Asp Val Leu Lys Cys Leu Lys Ala Pro Ile Leu 150
Ser Asp Ser Ser Cys Lys Ser Ala Tyr Pro Gly Gln Ile Thr Ser 165
Asn Met Phe Cys Ala Gly Tyr Leu Glu Gly Gly Lys Asp Ser Cys 180
Gln Gly Asp Ser Gly Gly Pro Val Val Cys Ser Gly Lys Leu Gln 195
Gly Ile Val Ser Trp Gly Ser Gly Cys Ala Gln Lys Asn Lys Pro 210
Gly Val Tyr Thr Lys Val Cys Asn Tyr Val Ser Trp Ile Lys Gln 225
Thr Ile Ala Ser Asn 230
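SEQ ID NO: 24 begins Met-Val-Asp-Asp-Asp-Asp-Lys, and enteropeptidase is known to cleave after the lysine of an (Asp)4-Lys motif. The sketch below illustrates how that cleavage would release the mature N-terminus. It uses only the first 22 residues of SEQ ID NO: 24 in one-letter code; the function name and the plain pattern match are illustrative assumptions and do not model the enzyme itself.

```python
import re

# First 22 residues of SEQ ID NO: 24 in one-letter code.
TRYPSINOGEN_N_TERM = "MVDDDDKIVGGYTCGANTVPYQ"

# (Asp)4-Lys activation motif; cleavage is taken to occur after the Lys.
ACTIVATION_SITE = re.compile(r"DDDDK")


def activate(zymogen: str) -> tuple:
    """Split a zymogen sequence into (activation peptide, mature chain)."""
    match = ACTIVATION_SITE.search(zymogen)
    if match is None:
        raise ValueError("no (Asp)4-Lys activation site found")
    return zymogen[:match.end()], zymogen[match.end():]


if __name__ == "__main__":
    peptide, mature = activate(TRYPSINOGEN_N_TERM)
    print(peptide, mature)
```

The released chain starts Ile-Val-Gly-Gly, which is the sequence that follows the initiating Met in SEQ ID NO: 22.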



Claims

1. A recombinant DNA expression vector comprising the DNA sequence of Sequence I.D. 21.

2. The vector of claim 1 that is plasmid pRMG4.

3. A recombinant DNA expression vector comprising the DNA sequence of Sequence I.D. 23.

4. The vector of claim 3 that is plasmid pRMG7.

5. A method of producing bovine trypsin comprising culturing a host cell transformed with the
vector of claim 1 under conditions appropriate for production of bovine trypsin.

6. The method of claim 5 wherein said host cell is a lon- host cell.

7. A method of producing bovine trypsinogen comprising culturing a host cell transformed with the
vector of claim 3 under conditions appropriate for production of bovine trypsinogen.

8. The method of claim 7 wherein said vector is plasmid pRMG7.

9. The method of claim 7 wherein said host cell is a lon- host cell.

10. A method of producing bovine trypsin comprising
(a) culturing a host cell transformed with the vector of claim 3 under conditions appropriate for
production of bovine trypsinogen
(b) recovering the trypsinogen from step (a) and
(c) enzymatically converting the trypsinogen to trypsin.

11. A method for converting human proinsulin to human insulin comprising treating human proinsulin
with biosynthetically produced trypsin.



FIG. 2

FIG. 3

FIG. 4

[Figure: restriction site labels EcoRI, NdeI, NarI, ApaLI, BamHI, PvuII]

FIG. 5